Bayesian adaptation of speech recognizers to field speech data
Abstract
This work studies a Bayesian (Maximum A Posteriori, MAP) approach to adapting Continuous Density Hidden Markov Models (CDHMMs) to a specific condition of a speech recognition application. To improve model robustness, CDHMMs previously trained on laboratory data are adapted using context-dependent field utterances. Two specific problems arise with the MAP approach: estimating the parameters of the a priori distribution, and the lack of field adaptation data for some distributions of the CDHMM. Estimating the a priori distribution parameters requires identifying different realizations of the model parameters; three different solutions are proposed and evaluated. To overcome the lack of adaptation data, field acoustic training frames may be shared among similar distributions. This sharing is performed using an acoustic tree, obtained by progressively clustering the model distributions. Recognition results show that MAP-adapted models significantly outperform those trained by Maximum Likelihood (ML), especially when the field data set is small.
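As a rough illustration of why a MAP estimate can help when field data is scarce, the sketch below applies the textbook conjugate-prior update to a single Gaussian mean. This is an assumption made for illustration only: the abstract does not give the paper's actual update formulas, and the names `map_adapt_mean` and `tau` are hypothetical.

```python
import numpy as np

def map_adapt_mean(prior_mean, tau, frames, posteriors):
    """MAP re-estimate of one Gaussian mean under a conjugate prior.

    prior_mean : (D,) mean of the laboratory-trained Gaussian (prior mode)
    tau        : prior weight, > 0 (hypothetical hyperparameter)
    frames     : (T, D) field adaptation frames
    posteriors : (T,) occupation probability of this Gaussian per frame
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    frames = np.asarray(frames, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)

    occupancy = posteriors.sum()          # soft count of field frames
    weighted_sum = posteriors @ frames    # (D,) posterior-weighted frame sum

    # Interpolates between the prior mean (little data) and the
    # ML estimate weighted_sum / occupancy (lots of data).
    return (tau * prior_mean + weighted_sum) / (tau + occupancy)


# Two field frames barely move the mean away from the prior,
# whereas the ML estimate would jump straight to [2, 2].
few = map_adapt_mean([0.0, 0.0], 10.0, [[1.0, 1.0], [3.0, 3.0]], [1.0, 1.0])

# With many frames the MAP estimate approaches the ML estimate.
many = map_adapt_mean([0.0, 0.0], 10.0, np.full((1000, 2), 2.0), np.ones(1000))
```

In this sketch, a larger `tau` keeps the adapted mean closer to the laboratory model, while a smaller `tau` trusts the field frames more.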
Similar resources
Unsupervised adaptive speech technology for limited resource languages: a case study for Tamil
This paper evaluates adaptive speech technology for creating low cost, rapidly deployable speech recognizers for new languages with very limited data. A multi-modal (speech and touch) dialog system in Tamil, which delivered agricultural information to rural villagers, is described. Based on the field recordings from this system, a number of automatic speech recognition (ASR) adaptation techniqu...
Transformation-based Bayesian predictive classification using online prior evolution
The mismatch between training and testing environments makes it necessary for speech recognizers to be adaptive in both acoustic modeling and decision making. Accordingly, the speech hidden Markov models (HMMs) should be able to incrementally capture the evolving statistics of environments using online available data. Also, it is necessary for speech recognizers to exploit the robust decision s...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Telephone speech recognition using neural networks and hidden Markov models
The performance of well-trained speech recognizers using high-quality, full-bandwidth speech data usually degrades in real-world environments. In particular, telephone speech recognition is extremely difficult due to the limited bandwidth of transmission channels. In this paper, neural-network-based adaptation methods are applied to telephone speech recognition, and a new unsupervised mode...